social scientist
Text Annotation via Inductive Coding: Comparing Human Experts to LLMs in Qualitative Data Analysis
Parfenova, Angelina, Marfurt, Andreas, Denzler, Alexander, Pfeffer, Juergen
This paper investigates the automation of qualitative data analysis, focusing on inductive coding using large language models (LLMs). Unlike traditional approaches that rely on deductive methods with predefined labels, this research examines the inductive process, where labels emerge from the data. The study evaluates the performance of six open-source LLMs compared to human experts. As part of the evaluation, experts rated the perceived difficulty of the quotes they coded. The results reveal a peculiar dichotomy: human coders consistently perform well when labeling complex sentences but struggle with simpler ones, while LLMs exhibit the opposite trend. Additionally, the study explores systematic deviations in both human- and LLM-generated labels by comparing them to the gold standard from the test set. While human annotations may sometimes differ from the gold standard, they are often rated more favorably by other humans. In contrast, some LLMs demonstrate closer alignment with the true labels but receive lower evaluations from experts.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- (3 more...)
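As a rough illustration of the comparison described in the abstract above, one could score LLM-generated inductive codes against gold-standard labels with embedding similarity. This is a minimal sketch, not the authors' evaluation code; the sentence-transformers model name and the example labels are illustrative assumptions.

```python
# A minimal sketch (not the paper's code) of scoring LLM-generated inductive
# codes against gold-standard labels via embedding similarity.
# The model name and example labels are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

gold_labels = ["distrust of institutions", "financial insecurity"]
llm_labels = ["skepticism toward authorities", "worries about money"]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
gold_emb = encoder.encode(gold_labels, convert_to_tensor=True)
llm_emb = encoder.encode(llm_labels, convert_to_tensor=True)

# Cosine similarity between each LLM code and its gold counterpart.
for gold, llm, sim in zip(gold_labels, llm_labels,
                          util.cos_sim(llm_emb, gold_emb).diagonal()):
    print(f"{llm!r} vs {gold!r}: similarity = {float(sim):.2f}")
```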
Sorrel: A simple and flexible framework for multi-agent reinforcement learning
Gelpí, Rebekah A., Ju, Yibing, Jackson, Ethan C., Tang, Yikai, Verch, Shon, Voelcker, Claas, Cunningham, William A.
We introduce Sorrel (https://github.com/social-ai-uoft/sorrel), a simple Python interface for generating and testing new multi-agent reinforcement learning environments. This interface places a high degree of emphasis on simplicity and accessibility, and uses a more psychologically intuitive structure for the basic agent-environment loop, making it a useful tool for social scientists to investigate how learning and social interaction lead to the development and change of group dynamics. In this short paper, we outline the basic design philosophy and features of Sorrel.
- North America > Canada > Ontario > Toronto (0.16)
- North America > United States > Illinois > Cook County > Chicago (0.04)
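The agent-environment loop the Sorrel abstract describes can be pictured with a generic multi-agent sketch like the one below. The class and method names are illustrative assumptions, not Sorrel's actual API; the repository linked above has the real interface.

```python
# A generic multi-agent observe-act-learn loop in the spirit of the description
# above. Class and method names are illustrative assumptions, not Sorrel's API.
import random

class RandomAgent:
    def __init__(self, actions):
        self.actions = actions
    def act(self, observation):
        return random.choice(self.actions)
    def learn(self, observation, action, reward):
        pass  # a real agent would update its policy here

class SharingEnv:
    """Toy environment: agents choose to 'share' or 'hoard' a resource."""
    def observe(self, agent_id):
        return {"agent": agent_id}
    def step(self, agent_id, action):
        return 1.0 if action == "share" else 0.5  # reward for the chosen action

env = SharingEnv()
agents = {i: RandomAgent(["share", "hoard"]) for i in range(3)}

for turn in range(10):
    for agent_id, agent in agents.items():
        obs = env.observe(agent_id)
        action = agent.act(obs)
        reward = env.step(agent_id, action)
        agent.learn(obs, action, reward)
```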
The Risks of Using Large Language Models for Text Annotation in Social Science Research
Generative artificial intelligence (GenAI) or large language models (LLMs) have the potential to revolutionize computational social science, particularly in automated textual analysis. In this paper, we conduct a systematic evaluation of the promises and risks of using LLMs for diverse coding tasks, with social movement studies serving as a case example. We propose a framework for social scientists to incorporate LLMs into text annotation, either as the primary coding decision-maker or as a coding assistant. This framework provides tools for researchers to develop the optimal prompt, and to examine and report the validity and reliability of LLMs as a methodological tool. Additionally, we discuss the associated epistemic risks related to validity, reliability, replicability, and transparency. We conclude with several practical guidelines for using LLMs in text annotation tasks, and how we can better communicate the epistemic risks in research.
- Europe > Middle East (0.05)
- Asia > Middle East (0.05)
- Africa > Middle East (0.05)
- (5 more...)
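One reliability check in the spirit of the framework described above is reporting chance-corrected agreement between LLM and human coders on the same items. A minimal sketch using Cohen's kappa follows; the toy labels are invented for illustration, not drawn from the paper.

```python
# A minimal sketch of one reliability check such a framework calls for:
# agreement between LLM and human coders on the same items.
# The labels below are toy data, not results from the paper.
from sklearn.metrics import cohen_kappa_score

human_codes = ["protest", "protest", "riot", "strike", "protest", "riot"]
llm_codes = ["protest", "riot", "riot", "strike", "protest", "strike"]

kappa = cohen_kappa_score(human_codes, llm_codes)
print(f"Cohen's kappa (human vs. LLM): {kappa:.2f}")
```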
SCALE: Towards Collaborative Content Analysis in Social Science with Large Language Model Agents and Human Intervention
Zhao, Chengshuai, Tan, Zhen, Wong, Chau-Wai, Zhao, Xinyan, Chen, Tianlong, Liu, Huan
Content analysis breaks down complex and unstructured texts into theory-informed numerical categories. In social science particularly, this process usually relies on multiple rounds of manual annotation, domain expert discussion, and rule-based refinement. In this paper, we introduce SCALE, a novel multi-agent framework that effectively Simulates Content Analysis via Large language model agEnts. SCALE imitates key phases of content analysis, including text coding, collaborative discussion, and dynamic codebook evolution, capturing the reflective depth and adaptive discussions of human researchers. Furthermore, by integrating diverse modes of human intervention, SCALE is augmented with expert input to further enhance its performance. Extensive evaluations on real-world datasets demonstrate that SCALE achieves human-approximated performance across various complex content analysis tasks, offering an innovative potential for future social science research.
- North America > United States > North Carolina (0.04)
- North America > United States > Florida > Orange County > Orlando (0.04)
- North America > United States > Texas > Bexar County > San Antonio (0.04)
- (9 more...)
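The coding, discussion, and codebook-evolution phases that SCALE imitates can be sketched schematically as below. The `query_llm` placeholder, the prompts, and the loop structure are assumptions for illustration, not SCALE's implementation.

```python
# A schematic sketch of a coding -> discussion -> codebook-update cycle like
# the one described above. `query_llm` is a placeholder for any chat-completion
# call; prompts and structure are illustrative assumptions, not SCALE's code.
def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def analyze(documents, codebook, coder_personas, n_rounds=3):
    for _ in range(n_rounds):
        # 1. Each agent codes every document against the current codebook.
        codes = {
            persona: [query_llm(f"You are {persona}. Codebook: {codebook}. "
                                f"Assign a code to: {doc}") for doc in documents]
            for persona in coder_personas
        }
        # 2. Agents discuss their disagreements.
        discussion = query_llm(f"Coders produced these codes: {codes}. "
                               "Summarize disagreements and propose resolutions.")
        # 3. The codebook evolves based on the discussion; human experts could
        #    intervene here, e.g. by editing the proposed revision.
        codebook = query_llm(f"Revise this codebook: {codebook}\n"
                             f"based on this discussion: {discussion}")
    return codebook, codes
```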
New database features 250 AI tools that can enhance social science research
AI – or artificial intelligence – is often used as a way to summarize data and improve writing. But AI tools also represent a powerful and efficient way to analyze large amounts of text to search for patterns. In addition, AI tools can assist with developing research products that can be shared widely. It's with that in mind that we, as researchers in social science, developed a new database of AI tools for the field. In the database, we compiled information about each tool and documented whether it was useful for literature reviews, data collection and analyses, or research dissemination.
Detecting Mode Collapse in Language Models via Narration
No two authors write alike. Personal flourishes invoked in written narratives, from lexicon to rhetorical devices, imply a particular author--what literary theorists label the implied or virtual author; distinct from the real author or narrator of a text. Early large language models trained on unfiltered training sets drawn from a variety of discordant sources yielded incoherent personalities, problematic for conversational tasks but proving useful for sampling literature from multiple perspectives. Successes in alignment research in recent years have allowed researchers to impose subjectively consistent personae on language models via instruction tuning and reinforcement learning from human feedback (RLHF), but whether aligned models retain the ability to model an arbitrary virtual author has received little scrutiny. By studying 4,374 stories sampled from three OpenAI language models, we show successive versions of GPT-3 suffer from increasing degrees of "mode collapse" whereby overfitting the model during alignment constrains it from generalizing over authorship: models suffering from mode collapse become unable to assume a multiplicity of perspectives. Our method and results are significant for researchers seeking to employ language models in sociological simulations.
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (5 more...)
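A simple proxy for the shrinking authorial diversity the abstract describes is a distinct-n statistic over sampled stories: as outputs collapse toward a single voice, the fraction of unique n-grams drops. The sketch below is illustrative only and is not the metric used in the paper.

```python
# An illustrative proxy for mode collapse: distinct-n over sampled stories.
# Lower values suggest outputs are converging on a single style or template.
def distinct_n(texts, n=2):
    """Fraction of n-grams across all texts that are unique."""
    ngrams = []
    for text in texts:
        tokens = text.lower().split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

stories_model_a = ["Once upon a time a sailor dreamed of the sea.",
                   "The accountant counted stars instead of ledgers."]
stories_model_b = ["Once upon a time there was a kingdom.",
                   "Once upon a time there was a kingdom far away."]

print("model A distinct-2:", round(distinct_n(stories_model_a), 2))
print("model B distinct-2:", round(distinct_n(stories_model_b), 2))  # lower = more collapsed
```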
Is Machine Learning Unsafe and Irresponsible in Social Sciences? Paradoxes and Reconsidering from Recidivism Prediction Tasks
Initially, these scholars use historical factors to forecast whether an offender will re-offend; the binary recidivism outcome then serves as a proxy variable for recidivism risk. Some computer scientists also treat the probability (or score) the model assigns to an offender's likelihood of re-offense as a gauge of recidivism risk (Etzler et al., 2023; Ma et al., 2022; Wang et al., 2022). While such configurations may seem intuitively compelling, they often embody an oversimplified and deterministic viewpoint that contradicts contemporary social science theories. First, historical factors alone are insufficient predictors of human action.
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Government (0.93)
- Education (0.93)
- (3 more...)
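The configuration the excerpt critiques can be made concrete with a short sketch: a classifier fit on historical factors whose predicted probability is then read as a "recidivism risk" score. The features and data below are synthetic and purely illustrative.

```python
# A minimal sketch of the setup the excerpt critiques: a classifier is fit on
# historical factors and its predicted probability is treated as "risk".
# The features and data here are synthetic, for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))      # e.g. age at first offense, prior counts, ...
y = rng.integers(0, 2, size=200)   # observed re-offense (the binary proxy)

model = LogisticRegression().fit(X, y)
risk_scores = model.predict_proba(X)[:, 1]  # read as "recidivism risk", per the critique
print("mean predicted risk:", risk_scores.mean().round(2))
```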
Investigative Pattern Detection Framework for Counterterrorism
Muramudalige, Shashika R., Hung, Benjamin W. K., Libretti, Rosanne, Klausen, Jytte, Jayasumana, Anura P.
Law-enforcement investigations aimed at preventing attacks by violent extremists have become increasingly important for public safety. The problem is exacerbated by the massive data volumes that need to be scanned to identify complex behaviors of extremists and groups. Automated tools are required to extract information in response to analysts' queries, continually scan new information, integrate it with past events, and then raise alerts about emerging threats. We address challenges in investigative pattern detection and develop an Investigative Pattern Detection Framework for Counterterrorism (INSPECT). The framework integrates numerous computing tools that include machine learning techniques to identify behavioral indicators and graph pattern matching techniques to detect risk profiles/groups. INSPECT also automates multiple tasks for large-scale mining of detailed forensic biographies, forming knowledge networks, and querying for behavioral indicators and radicalization trajectories. INSPECT targets a human-in-the-loop mode of investigative search and has been validated and evaluated using an evolving dataset on domestic jihadism.
- North America > United States > Colorado (0.06)
- North America > United States > New York (0.04)
- North America > United States > Michigan (0.04)
- (3 more...)
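The graph pattern matching that the abstract describes for surfacing risk profiles can be illustrated, very roughly, with subgraph matching over a toy knowledge network in networkx. The graph, node attributes, and query pattern below are illustrative assumptions, not INSPECT's schema.

```python
# A toy sketch of graph pattern matching over a small knowledge network, in the
# spirit of the risk-profile queries described above. The graph, attributes,
# and pattern are illustrative assumptions, not INSPECT's schema.
import networkx as nx
from networkx.algorithms import isomorphism

# Knowledge network: people and the behavioral indicators linked to them.
G = nx.DiGraph()
G.add_node("person_1", kind="person")
G.add_node("traveled_abroad", kind="indicator")
G.add_node("online_recruitment", kind="indicator")
G.add_edge("person_1", "traveled_abroad")
G.add_edge("person_1", "online_recruitment")

# Query pattern: any person linked to two indicators.
P = nx.DiGraph()
P.add_node("p", kind="person")
P.add_node("i1", kind="indicator")
P.add_node("i2", kind="indicator")
P.add_edge("p", "i1")
P.add_edge("p", "i2")

matcher = isomorphism.DiGraphMatcher(
    G, P, node_match=lambda a, b: a["kind"] == b["kind"])
for mapping in matcher.subgraph_isomorphisms_iter():
    print("match:", mapping)
```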
Do YOU have what it takes? Scientists reveal personality checklist for people who could colonize Mars
Bad news for those who struggle with anxiety, get too competitive, or simply choke under pressure: new research suggests that you may have to stay at home on Earth while other, more laid-back and 'agreeable' types colonize Mars. The new study, which is still undergoing peer review, ran computer simulations tracking the progress of human settlements on the Red Planet through their first 28 years of virtual operation. 'Agreeable personality types were assessed to be the most enduring for the long term,' the researchers found, across all four of the personality types used in their simulations, 'whereas neurotics showed least adaptation capacity.' The researchers also discovered that the minimum number of settlers needed to successfully operate a human colony on Mars was much lower than previously expected: just 22 people. 'Contrary to other literature,' they wrote of their simulated Martian colonies, 'the minimum number of people with all personality types that can lead to a sustainable settlement is in the tens and not hundreds.'
'There was all sorts of toxic behaviour': Timnit Gebru on her sacking by Google, AI's dangers and big tech's biases
"It feels like a gold rush," says Timnit Gebru. "In fact, it is a gold rush. And a lot of the people who are making money are not the people actually in the midst of it. But it's humans who decide whether all this should be done or not. We should remember that we have the agency to do that." Gebru is talking about her specialised field: artificial intelligence. On the day we speak via a video call, she is in Kigali, Rwanda, preparing to host a workshop and chair a panel at an international conference on AI. It will address the huge growth in AI's capabilities, as well as something that the frenzied conversation about AI misses out: the fact that many of its systems may well be built on a huge mess of biases, inequalities and imbalances of power. This gathering, the clunkily titled International Conference on Learning Representations, marks the first time people in the field have come together in an African country – which makes a powerful point about big tech's neglect of the global south. When Gebru talks about the way that AI "impacts people all over the world and they don't get to have a say on how they should shape it", the issue is thrown into even sharper relief by her backstory. In her teens, Gebru was a refugee from the war between Ethiopia, where she grew up, and Eritrea, where her parents were born. After a year in Ireland, she made it to the outskirts of Boston, Massachusetts, and from there to Stanford University in northern California, which opened the way to a career at the cutting edge of the computing industry: Apple, then Microsoft, followed by Google. But in late 2020, her work at Google came to a sudden end. As the co-leader of Google's small ethical AI team, Gebru was one of the authors of an academic paper that warned about the kind of AI that is increasingly built into our lives, taking internet searches and user recommendations to apparently new levels of sophistication and threatening to master such human talents as writing, composing music and analysing images. The clear danger, the paper said, is that such supposed "intelligence" is based on huge data sets that "overrepresent hegemonic viewpoints and encode biases potentially damaging to marginalised populations". Put more bluntly, AI threatens to deepen the dominance of a way of thinking that is white, male, comparatively affluent and focused on the US and Europe. In response, senior managers at Google demanded that Gebru either withdraw the paper, or take her name and those of her colleagues off it. This triggered a run of events that led to her departure. Google says she resigned; Gebru insists that she was fired. What all this told her, she says, is that big tech is consumed by a drive to develop AI and "you don't want someone like me who's going to get in your way."
- North America > United States > Massachusetts > Suffolk County > Boston (0.24)
- Europe (0.24)
- Africa > Rwanda > Kigali > Kigali (0.24)
- (5 more...)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law (1.00)
- Information Technology > Services (0.90)